Associate Director | Assistant Professor @ Georgetown University

About

I am the Associate Director for the Master of Science in Data Science for Public Policy program (DSPP) and an assistant teaching professor in the McCourt School of Public Policy at Georgetown University. I earned my Ph.D. from the University of Maryland, College Park where I worked under the direction of Johanna Birnir, David Cunningham, Kathleen Cunningham, and Ernesto Calvo.

My research examines the distribution and impact of political violence perpetrated by non-state organizations to isolate plausible policy interventions targeted at reducing the occurrence and spread of conflict. Ongoing projects explore the effect of membership heterogeneity on the strategic use of violent tactics by armed actors; integration of multiple conflict event datasets to improve measurement of violent activity; and the use of live simulated environments to analyze normative behavior.

As a computational social scientist, I develop, use, and teach computational tools that (a) apply machine learning and computational methods to draw descriptive inferences from data and (b) leverage non-traditional data assets to better understand social processes.

Education

  • Ph.D., Political Science, 2019 | University of Maryland, College Park
  • MA, Political Science, 2016 | University of Maryland, College Park
  • BA, Political Science, 2010 | Beloit College

Interests

  • Political Violence
  • Computational Social Science
  • Conflict Event Data
  • Organizational and Collective Behavior
  • Live Simulated Environments

Research



Integrating Conflict Event Data

Disaggregated studies of conflict typically rely on a single dataset to make inferences. In this project, we advocate integrating multiple datasets to improve measurement and analysis.






Tactical Adaptation

This project examines why some violent non-state actors experiment with and develop a broad repertoire of tactics and targets to achieve their political aims while other groups consistently utilize the same methods across their lifespan.









Live Simulated Environments

We leverage live simulated environments to examine individual and group-level normative and strategic behavior.








Diplomatic Networks

We explore structural breaks in diplomatic meeting networks as a predictor for shifts in foreign policy.








Publications

Peer-Reviewed

2019 “Integrating Conflict Event Data” (with Karsten Donnay, Erin McGrath, David Cunningham, and David Backer). Journal of Conflict Resolution.
The growing multitude of sophisticated event-level data collection enables novel analyses of conflict. Even when multiple event data sets are available, researchers tend to rely on only one. We instead advocate integrating information from multiple event data sets. The advantages include facilitating analysis of relationships between different types of conflict, providing more comprehensive empirical measurement, and evaluating the relative coverage and quality of data sets. Existing integration efforts have been performed manually, with significant limitations. Therefore, we introduce Matching Event Data by Location, Time and Type (MELTT), an automated, transparent, reproducible methodology for integrating event data sets. For the cases of Nigeria 2011, South Sudan 2015, and Libya 2014, we show that using MELTT to integrate data from four leading conflict event data sets (Uppsala Conflict Data Program–Georeferenced Event Data, Armed Conflict Location and Event Data, Social Conflict Analysis Database, and Global Terrorism Database) provides a more complete picture of conflict. We also apply multiple systems estimation to show that each of these data sets has substantial missingness in coverage.
2018 “Cultural Imprinting, Institutions, and the Organization of New Firms” (with David Waguespack and Johanna K. Birnir). Strategic Science.
Do firm founders from nations with more predictable and transparent institutions allocate more autonomy to their employees? A cultural imprinting view suggests that institutions inculcate beliefs that operate beyond the environment in which those beliefs originate. We leverage data from a multiplayer online role-playing game, EVE Online, a setting where individuals can establish and run their own corporations. EVE players come from around the world, but all face the same institutional environment within the game. This setting allows us to disentangle, for the first time, cultural norms from the myriad other local factors that will influence organizational design choices across nations. Our main finding is that founders residing in nations with more predictable and transparent real world institutions delegate more authority within the virtual firms they create.
2014 “A Voice in the Process: A cross-national look at ethnic inclusion and economic growth in the world” (with Johanna K. Birnir). Development.
Does greater ethnic inclusion into the executive have a positive effect on a country’s economic development? We posit that by allowing for greater diversity in a state’s decision-making process, ethnic populations find their preferences represented and thus are more likely to support enacted policies; at the same time the quality of the policy increases as a greater variety of perspectives are introduced. Utilizing the new AMAR (All-Minorities at Risk) data to capture ethnic diversity, this article offers a preliminary description, suggesting that higher levels of inclusion positively correlate with indicators of economic growth.

Contributions to Books

2017 “The Geography of Organized Armed Violence Around the World” (with Erik Melander and David Backer). Peace and Conflict 2017, Routledge.
This chapter offers insight into the utility of the latest release of Uppsala Conflict Data Program’s Georeferenced Event Dataset (UCDP-GED). The UCDP has an established record of compiling and disseminating an array of widely used data resources. The field of conflict studies, and the data that contributing scholars collect, have progressively moved toward greater specificity along several dimensions. UCDP-GED records the category of violence, the actors involved, the location and associated coordinates, and the timing of each event, as well as other characteristics. UCDP has been the source of the most widely used data in academic research on violence committed by organized armed actors. In particular, UCDP-GED provides a means for analyses to test micro-level theories. UCDP-GED has paved the way for methodological advances with a major bearing on substantive contributions to the literature.

Media

2019 “Where a Founder Is from Affects How They Structure Their Company” (with David Waguespack and Johanna K. Birnir). Harvard Business Review.

Software

MELTT (Matching Event Data by Location, Time, and Type): An R package that offers a methodology for systematically integrating disparate geospatial event data by leveraging information on spatio-temporal co-occurrence and event-specific metadata.


Teaching

I primarily teach graduate-level computational social science courses at Georgetown University. As an instructor, I try to balance substance with methodological rigor by training students how to effectively employ computational methods to investigate, analyze, and learn from data to formulate and test theoretically-relevant hypotheses. In my instruction, I match formal computational training with hands-on empirical examples so that quantitative methods are taught in the context where they are applied.

I aim to train students to: (i) use machine learning methods to explore and generate hypotheses from data; (ii) design and implement statistical analyses geared toward inferring causal relationships from observational and experimental data; (iii) synthesize disparate and unstructured data to draw meaningful insights for public policy and political science inquiries; and (iv) visualize data to communicate empirical findings effectively. My goal is to train students to be effective consumers, critics, and producers of computational social science.


COURSE CATALOG

Accelerated Statistics for Public Policy II (PPOL561)

Course taught: Spring 2019, Spring 2020

This is the second course in the two-course sequence on quantitative methods for social science for the Master of Science in Data Science for Public Policy (DSPP). The course builds on students’ understanding of multivariate regression and introduces advanced, but commonly used, methods of statistical analysis. The course is broadly divided into two parts: advanced modeling and causal inference. Instruction will concentrate on how to determine the appropriate econometric approach in addressing various types of policy questions, while highlighting the challenges in isolating causal effects. The emphasis is on applied learning; formal proofs and mathematical rigor are presented but are not the principal focus of the course. As part of our effort to teach effective communication skills, students will make presentations about applications using the techniques being studied in class.

Data Science I: Foundations (PPOL564)

Course taught: Fall 2018, Fall 2019

This first course in the core data science sequence for the Master of Science in Data Science for Public Policy (DSPP) introduces students to the programming and mathematical concepts that underpin statistical learning. The aim of the course is to provide DSPP students with the foundations necessary to grasp the concepts and algorithms encountered in Data Science II and III. Students will cover topics related to linear algebra (with a focus on linear regression and dimension reduction); multivariate calculus (with an emphasis on optimization algorithms, specifically gradient descent); and probability theory (with an emphasis on simulation and sampling). Throughout the course, students will be introduced to the fundamentals of programming and manipulating data in Python. Students will work in Jupyter notebooks and use Git/GitHub to submit coding assignments, developing literate programming and reproducible research skills they will use throughout the program.

Introduction to Data Science (PPOL670)

Course taught: Spring/Fall 2019, Spring 2020

This course teaches Master of Public Policy (MPP) students how to synthesize disparate, possibly unstructured data in order to draw meaningful insights. Topics covered include fundamentals of functional programming in R, literate programming, data wrangling, data visualization, data extraction (via web scraping and APIs), text analysis, and machine learning methods. In addition, students will be exposed to Git and GitHub for reproducible research. The course aims to offer students a practical toolkit for data exploration. The objective of the course is to equip MPP students with the skills to incorporate data into their decision-making and analysis.


ADVISING

I advise thesis projects for students in the Masters of Conflict Resolution program at Georgetown University.

Current Advisees

  • Ayaka Oishi


Talks

Invited Talks

  • 2019 “Conflict Event History Data and Prediction”, Central Intelligence Agency
  • 2019 “Predicting Conflict Occurrence Using Search History Data”, Facebook

Conference Presentations

  • 2019 “Membership Diversity and Tactical Variation”, American Political Science Association
  • 2019 “Gender Norms and Violent Behavior in a Virtual World”, Politics and Computational Social Science
  • 2019 “Predicting Conflict Occurrence using Search History Data”, International Studies Association
  • 2018 “Advancing Measurement in the Study of Conflict and Political Violence”, Peace Science Society
  • 2017 “A Break from the Past: why mapping deviations in diplomatic networks reveals shifts in foreign policy strategies”, International Studies Association
  • 2017 “A Break from the Past: why mapping deviations in diplomatic networks reveals shifts in foreign policy strategies”, Peace Science Society
  • 2016 “An Automated Aggregation of Geo-coded Violent and Non-violent Conflict Events”, Peace Science Society
  • 2016 “Integrating African Conflict Event Data”, American Political Science Association
  • 2016 “MELTT: Matching Event Data by Location, Time, and Type”, The Society for Political Methodology
  • 2016 “MELTT: Matching Event Data by Location, Time, and Type”, Midwest Political Science Association
  • 2015 “Providing to Compete: An Examination of Social Welfare Provisions by Regime Change Movements”, Midwest Political Science Association

Workshops

  • 2019 “Building a professional web presence using R”. (Data Science in Action Seminar) McCourt School of Public Policy, Georgetown University
  • 2018 “A Crash Course in Statistical Computing” (Short Course) College of Behavioral and Social Sciences, University of Maryland, College Park
  • 2017 “Applied Statistics and Data Management in R” (Short Course) Smith School of Business, University of Maryland, College Park
  • 2017 “Tools and Best Practices for Integrating Spatial Data” (APSA Short Course co-taught with Karsten Donnay and Andrew Linke) American Political Science Association
  • 2017 “An Introduction To Statistical Programming In R: A short course on processing, analyzing, and visualizing data in R” (Short Course) Creative Associates International, Washington DC
  • 2017 “The ABC’s of Bayesian Estimation in R” (Talk) University of Maryland, College Park
  • 2016 “Learning R Programming” (Short Course) Department of Government and Politics, University of Maryland, College Park
  • 2016 “Web-scraping and Automated Data Process in R” (Workshop) University of Iceland, Reykjavik
  • 2016 “Web-scraping and Automated Data Process in R” (Workshop) University of Maryland, College Park
  • 2015 “Functional Programming in R” (Workshop) University of Maryland, College Park
  • 2015 “Real-Time Modeling of Social Protest: Ferguson, Twitter, and the Opacity of Social Media Data” University of Maryland, College Park